Abstract

This paper is concerned with dynamic resource allocation in a cellular wireless network with slow fading for
support of data traffic having heterogeneous transmission delay requirements. The multiple-input single-output
(MISO) fading broadcast channel (BC) is of interest where the base station (BS) employs multiple transmit
antennas to realize simultaneous downlink transmission at the same frequency to multiple mobile users each
having a single receive antenna. An information-theoretic approach is taken for characterizing capacity limits of the
fading MISO-BC under various transmission delay considerations. First, this paper studies transmit optimization
at the BS when some users have delay-tolerant "packet" data and the others have delay-sensitive "circuit" data
for transmission at the same time. Based on the convex optimization framework, an online resource allocation
algorithm is derived that is amenable to efficient cross-layer implementation of both physical (PHY)-layer
multi-antenna transmission and media-access-control (MAC)-layer multiuser rate scheduling. Secondly, this paper
investigates the fundamental throughput-delay tradeoff for transmission over the fading MISO-BC. By comparing
the network throughput under completely relaxed versus strictly zero transmission delay constraint, this paper
characterizes the limiting loss in sum capacity due to the vanishing delay tolerance, termed the delay penalty,
under some prescribed user fairness for transmit rate allocation.



Index Terms

Broadcast channel (BC), fading channel, multi-antenna, throughput-delay tradeoff, dynamic resource
allocation, cross-layer optimization, convex optimization.

I. Introduction 

In mobile wireless networks, communications typically take place over time-varying channels. When this time- 
variation or fading is "fast" such that the channel state information (CSI) is hardly obtainable at the transmitter, a 
classical approach for mitigating impairments of fading to transmission reliability is to apply diversity techniques, 
such as coded diversity, antenna diversity, and path diversity. On the other hand, when the fading channel changes
sufficiently "slowly" such that the transmitter is able to acquire the CSI, a general approach to compensate for 
the fading is dynamic resource allocation, whereby transmit resources such as power, bit-rate, antenna-beam and 
bandwidth are dynamically allocated based upon the fading distribution. Effective implementation of dynamic 
resource allocation usually requires joint optimization of both physical (PHY)-layer transmission and
media-access-control (MAC)-layer rate scheduling in the classical communication protocol stack, and thus demands
a new cross-layer design methodology.

One challenging issue to be addressed for dynamic resource allocation in tomorrow's wireless networks is 
how to meet users' heterogeneous transmission quality-of-service (QoS) requirements. Among others, the
demand for wireless high-speed connectivity for both delay-tolerant "packet" data and delay-sensitive "circuit"
data is expected to rise significantly in the next decade. Therefore, the study of spectrally efficient and power-efficient
transmission schemes for support of heterogeneous delay-constrained data traffic becomes an important area for
research. On the other hand, because tolerance for a larger transmission delay incurred to data traffic allows for 
more flexible transmit power and rate adaptation over time and thereby leads to a larger transmission throughput 
in the long term, there is in general a fundamental throughput-delay tradeoff associated with dynamic resource 
allocation over fading channels. Characterization of such fundamental tradeoff is another important research 
problem because it reveals the ultimate gain achievable by dynamic resource allocation under realistic transmission 
delay requirements. 

This paper aims to provide concrete answers to the aforementioned problems by considering the fading
broadcast channel (BC) that models the downlink transmission in a typical wireless cellular network. An
information-theoretic approach is taken in this paper to address some fundamental limits of dynamic resource
allocation for the fading BC under various transmission delay considerations. In particular, the fading
multiple-input single-output (MISO) BC is considered, where the transmitter of the base station (BS) is equipped
with multiple antennas and the receiver of each mobile user with a single antenna. Because of multi-antennas at
the transmitter, spatial multiplexing can be used at the BS to support simultaneous transmission to mobile users
at the same frequency, named space-division-multiple-access (SDMA). A slow-fading environment is assumed, and for
simplicity, the block-fading (BF) channel model is adopted. It is further assumed that the BS has perfect user CSI
at its transmitter, and is thus able to perform a centralized dynamic resource allocation based upon multiuser
channel conditions. This paper's main contributions are summarized as follows:

• This paper studies optimal dynamic resource allocation for the fading MISO-BC when both
no-delay-constrained (NDC) packet data and delay-constrained (DC) circuit data are required for transmission at
the same time. A cross-layer optimization approach is taken for jointly optimizing capacity-achieving
multi-antenna transmission at the PHY-layer and fairness-ensured multiuser rate scheduling at the MAC-layer. A
convex optimization framework is formulated for minimizing the average transmit power at the BS subject 
to both NDC and DC user rate constraints. A two-layer Lagrange-duality method is shown to be the key for 
solving this problem. Based on this method, a novel online resource allocation algorithm that is amenable 
to efficient cross-layer implementation is derived, and its convergence behavior is validated. 

• This paper investigates the fundamental throughput-delay tradeoff for the fading MISO-BC under optimal 
dynamic resource allocation. By taking the difference between the maximum sum-rate of users under NDC 
and DC transmission subject to the constraint that the rate portion allocated to each user needs to be regulated 
by the same prescribed rate-profile, the paper presents a novel characterization for the limiting loss in sum 
capacity due to the vanishing delay tolerance, termed the delay penalty, for the fading MISO-BC. Thereby, 
the delay penalty provides the answer to the following interesting question: Comparing no delay constraint 
versus zero-delay constraint for all users in the network, how much is the maximum percentage of throughput 
gain achievable for all users by optimal dynamic resource allocation? 

The capacity region under NDC or DC transmission for a fading single-input single-output (SISO) BC has been 
characterized in [1], [2], and for a fading SISO multiple-access channel (MAC) in [3], [4]. A scenario similar
to the one in this paper, with mixed NDC and DC transmission, has also been considered in [5] for the single-user
multiple-input multiple-output (MIMO) fading channel, and in [6] for the fading SISO-BC. The comparison 
of achievable rates between NDC and DC transmission has been considered in [7] for the fading MIMO-BC. 
However, none of the above prior work has considered transmit optimization with mixed NDC and DC data 
traffic for the fading MISO-BC, which is addressed in this paper. On the other hand, throughput-delay and 
power-delay tradeoffs for communications over fading channels by exploiting the combined CSI and data buffer 
occupancy at the transmitter have been intensively studied in the literature for both single-user and multiuser 
transmission (e.g., [8] and references therein). In contrast to prior work, this paper studies the throughput-delay 
tradeoff from a new perspective by characterizing the fundamental delay penalty in the network throughput owing
to the stringent (zero) transmission delay constraint imposed by all users. The concept of rate-profile, or
equivalent definitions for specifying a certain fairness in user rate allocation, has also been considered for the SISO
multiuser channel in [9], [10], and for the MIMO multiuser channel in [11], [12]. However, to the author's
best knowledge, the application of rate-profile for characterizing the delay penalty in a multiuser fading channel is a
novelty of this paper. There has recently been a great deal of study on the real-time resource allocation algorithm,
named proportional fair scheduling (PFS) (e.g., [13]-[16] and references therein), which maximizes the network 
throughput by exploiting the multiuser channel variation and at the same time, maintains certain fairness among 
users in rate allocation. However, PFS is unable to guarantee any prescribed user rate demand. In this paper, a 
novel online scheduling algorithm is proposed to ensure that all NDC and DC user rate demands are satisfied 
with the minimum transmit power consumption at the BS. 

The remainder of this paper is organized as follows. Section II illustrates the fading MISO-BC model and
provides a summary of known information-theoretic results for it. Section III addresses the optimal cross-layer
dynamic resource allocation problem for support of simultaneous transmission of heterogeneous delay-constrained
traffic. Section IV characterizes the fundamental throughput-delay tradeoff for the fading MISO-BC. Section V
provides the simulation results. Finally, Section VI concludes the paper.

Notation: This paper uses upper-case boldface letters to denote matrices and lower-case boldface letters to
indicate vectors. For a square matrix $\boldsymbol{S}$, $|\boldsymbol{S}|$ and $\boldsymbol{S}^{-1}$ are its determinant and inverse, respectively. For any general
matrix $\boldsymbol{M}$, $\boldsymbol{M}^{\dagger}$ denotes its conjugate transpose. $\boldsymbol{I}$ and $\boldsymbol{0}$ indicate the identity matrix and the vector with all zero
elements, respectively. $\|\boldsymbol{x}\|$ denotes the Euclidean norm of a vector $\boldsymbol{x}$. $\mathbb{E}_n[\cdot]$ denotes statistical expectation over
the random variable $n$. $\mathbb{R}^{M}$ denotes the $M$-dimensional real Euclidean space and $\mathbb{R}^{M}_{+}$ is its non-negative orthant.
$\mathbb{C}^{x \times y}$ is the space of $x \times y$ matrices with complex number entries. The distribution of a circularly-symmetric
complex Gaussian (CSCG) vector with the mean vector $\boldsymbol{x}$ and the covariance matrix $\boldsymbol{\Sigma}$ is denoted by $\mathcal{CN}(\boldsymbol{x}, \boldsymbol{\Sigma})$,
and $\sim$ means "distributed as". $(x)^{+}$ denotes the non-negative part of a real number $x$.

II. System Model 

A MISO-BC with $K$ mobile users, each having a single antenna, and a fixed BS having $M$ antennas
is considered, as shown in Fig. 1. Because of multi-antennas at the transmitter, the BS is able to employ SDMA
to transmit to multiple users simultaneously over the same bandwidth. It is assumed that the transmission to all
users is synchronously divided into consecutive blocks, and the fading occurs from block to block but remains
static within a block of symbols, i.e., a block-fading (BF) model. Furthermore, it is assumed that the fading
process is stationary and ergodic. Let $n$ be the random variable representing the fading state. At fading state $n$,
the MISO-BC can be considered as a discrete-time channel represented by

$$y_k(n) = \boldsymbol{h}_k(n)\,\boldsymbol{x}(n) + z_k(n), \quad k = 1, \ldots, K, \tag{1}$$

where $y_k(n)$, $\boldsymbol{h}_k(n)$ and $z_k(n)$ denote the received signal, the $1 \times M$ downlink channel vector, and the receiver
noise for user $k$, respectively, and $\boldsymbol{x}(n) \in \mathbb{C}^{M \times 1}$ denotes the transmitted signal vector from the BS. It is assumed
that $z_k(n) \sim \mathcal{CN}(0, 1)$, $\forall n, k$. The transmitted signal $\boldsymbol{x}(n)$ can be further expressed as

$$\boldsymbol{x}(n) = \sum_{k=1}^{K} \boldsymbol{b}_k(n)\, s_k(n), \tag{2}$$

where $\boldsymbol{b}_k(n) \in \mathbb{C}^{M \times 1}$ and $s_k(n)$ represent the precoding vector and the transmitted codeword symbol for user
$k$, respectively, at fading state $n$. It is assumed that each user employs the optimal Gaussian code-book with
normalized codeword symbols, i.e., $s_k(n) \sim \mathcal{CN}(0, 1)$, $\forall k, n$, and the rate of the code-book for user $k$ at fading
state $n$ is denoted as $r_k(n)$. The allocated transmit power to user $k$ at fading state $n$ is denoted by $p_k(n)$, and
it can be easily verified that $p_k(n) = \|\boldsymbol{b}_k(n)\|^2$. The total transmit power from the BS at fading state $n$ is then
expressed as $p(n) = \sum_{k=1}^{K} p_k(n)$. Assuming full knowledge of the fading distribution, the BS is able to adapt
the transmission power $p_k(n)$ and rate $r_k(n)$ (both could be zero for some fading state $n$) allocated to user $k$ in
order to exploit multiuser channel variations over time. Assuming a long-term power constraint (LTPC) $p^*$ over
different fading states, the average transmit power at the BS needs to satisfy $\mathbb{E}_n[p(n)] \le p^*$.
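The signal model of this section can be illustrated with a minimal numerical sketch. It assumes the linear superposition model in which the BS signal is the sum of each user's precoding vector times a unit-variance CSCG symbol; the function names (`cscg`, `transmit_powers`, `simulate_block`) and the array layout are hypothetical choices made for illustration, not part of the paper.

```python
import numpy as np

def cscg(shape, rng):
    """Unit-variance circularly-symmetric complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def transmit_powers(B):
    """Per-user powers p_k = ||b_k||^2; B is M x K with precoders as columns."""
    return np.sum(np.abs(B) ** 2, axis=0)

def simulate_block(H, B, num_symbols, rng):
    """Received symbols for one fading block (channel fixed within the block).

    H: K x M array whose k-th row is the 1 x M channel h_k (assumed layout).
    B: M x K array whose k-th column is the precoder b_k.
    Returns a K x num_symbols array with y_k = h_k x + z_k in row k.
    """
    K = H.shape[0]
    s = cscg((K, num_symbols), rng)   # normalized codeword symbols s_k
    z = cscg((K, num_symbols), rng)   # receiver noise z_k ~ CN(0, 1)
    x = B @ s                         # superposition of precoded symbols
    return H @ x + z

rng = np.random.default_rng(0)
B = np.array([[1.0, 0.0], [1.0j, 2.0]])
p = transmit_powers(B)                # per-user powers
total_power = float(p.sum())          # p(n) for this fading state
y = simulate_block(np.eye(2, dtype=complex), B, 5, rng)
```

Averaging `total_power` over many independently drawn fading states would give an empirical estimate of the quantity bounded by the long-term power constraint.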

Supposing that $p(n)$ is given, the achievable rates $\{r_k(n)\}$ of users need to be contained in the corresponding
capacity region of the MISO-BC at fading state $n$, denoted by $\mathcal{C}_{\mathrm{BC}}(p(n), \{\boldsymbol{h}_k(n)\})$. Characterization of $\mathcal{C}_{\mathrm{BC}}$ will
become useful later in this paper when the issue of how to dynamically allocate transmit power and user rates
at different fading states is addressed. In many cases, it is more convenient to apply the celebrated duality result
between the Gaussian BC and MAC [17] to transform the capacity region characterization for the original BC to
that for its dual MAC. Assuming that in the dual SIMO-MAC of the original MISO-BC considered in this paper,
each user employs the optimal Gaussian code-book of rate $R_k(n)$ and has a transmit power $q_k(n)$, $k = 1, \ldots, K$,
at fading state $n$, by [18] the capacity region of the dual SIMO-MAC at fading state $n$ can be expressed as

$$\mathcal{C}_{\mathrm{MAC}}\big(\{q_k(n)\}, \{\boldsymbol{h}_k^{\dagger}(n)\}\big) = \Big\{ \boldsymbol{R}(n) \in \mathbb{R}^{K}_{+} : \sum_{k \in \mathcal{J}} R_k(n) \le \log_2 \Big| \sum_{k \in \mathcal{J}} \boldsymbol{h}_k^{\dagger}(n) \boldsymbol{h}_k(n)\, q_k(n) + \boldsymbol{I} \Big|, \ \forall \mathcal{J} \subseteq \{1, \ldots, K\} \Big\}, \tag{3}$$

where $\boldsymbol{R}(n) = [R_1(n), \ldots, R_K(n)]$. The duality result [17] then states that an achievable rate region for the
original MISO-BC at fading state $n$ with total transmit power $p(n)$ can be expressed as

$$\mathcal{R}_{\mathrm{BC}}\big(p(n), \{\boldsymbol{h}_k(n)\}\big) = \bigcup_{\{q_k(n)\}:\, q_1(n) + \cdots + q_K(n) \le p(n)} \mathcal{C}_{\mathrm{MAC}}\big(\{q_k(n)\}, \{\boldsymbol{h}_k^{\dagger}(n)\}\big). \tag{4}$$

It was later shown in [19] that the above achievable rate region $\mathcal{R}_{\mathrm{BC}}(p(n), \{\boldsymbol{h}_k(n)\})$ is indeed the capacity
region $\mathcal{C}_{\mathrm{BC}}(p(n), \{\boldsymbol{h}_k(n)\})$ for the Gaussian BC. By applying the above results, it follows that any rate-tuple
$\{r_k(n)\}$ that is achievable in the fading MISO-BC by the user power-tuple $\{p_k(n)\}$ is also achievable as $\{R_k(n)\}$,
$R_k(n) = r_k(n)$, $\forall k, n$, in the dual fading SIMO-MAC by the corresponding user power-tuple $\{q_k(n)\}$ provided
that $\sum_{k=1}^{K} p_k(n) = \sum_{k=1}^{K} q_k(n)$, $\forall n$. Note that the power allocation $p_k(n)$ for user $k$ in the original BC is
not necessarily equal to $q_k(n)$ in the dual MAC. The transforms between $\{q_k(n)\}$ and $\{p_k(n)\}$ as well as the
corresponding precoding vectors $\{\boldsymbol{b}_k(n)\}$ for the same set of achievable rates $\{r_k(n)\}$ and $\{R_k(n)\}$ can be found
in [17], and are thus omitted in this paper for brevity.
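As an illustration of the dual SIMO-MAC region at a single fading state, the following sketch checks whether a rate tuple satisfies the standard polymatroid subset constraints, i.e., that every subset of users has a sum rate no larger than the log-determinant bound for that subset. The helper names and the convention that row `k` of `h` holds the $1 \times M$ channel of user $k$ are assumptions made for this example; the brute-force loop over subsets is only practical for small $K$.

```python
import itertools
import numpy as np

def mac_sum_rate_bound(h, q, subset):
    """log2 | sum_{k in subset} q_k h_k^H h_k + I | for row-vector channels h[k]."""
    M = h.shape[1]
    S = np.eye(M, dtype=complex)
    for k in subset:
        S = S + q[k] * np.outer(h[k].conj(), h[k])
    # slogdet is numerically safer than log(det(S)); S is Hermitian positive definite.
    return np.linalg.slogdet(S)[1] / np.log(2)

def in_mac_region(rates, h, q, tol=1e-9):
    """True if the rate tuple meets every subset sum-rate constraint."""
    K = len(rates)
    for size in range(1, K + 1):
        for subset in itertools.combinations(range(K), size):
            if sum(rates[k] for k in subset) > mac_sum_rate_bound(h, q, subset) + tol:
                return False
    return True

# Two users with orthogonal channels: the region is a rectangle [0, 1] x [0, 2].
h = np.eye(2, dtype=complex)
q = np.array([1.0, 3.0])
inside = in_mac_region([1.0, 2.0], h, q)
outside = in_mac_region([1.5, 1.5], h, q)
```

With orthogonal channels the users do not interfere, so each user's bound reduces to $\log_2(1 + q_k)$, which makes the example easy to verify by hand.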

III. Dynamic Resource Allocation under Heterogeneous Delay Constraints 

This section studies optimal dynamic resource allocation algorithms for the BF MISO-BC to support
simultaneous transmission of data traffic with heterogeneous transmit rate and delay constraints. First, Section
III-A provides the problem formulation. Then, Section III-B presents the solution based on the Lagrange-duality
method of convex optimization. Finally, Section III-C derives an online algorithm that is suitable for real-time
implementation of the proposed solution.

A. Problem Formulation 

The following rule for transmission scheduling at the BS is considered. As illustrated in Fig. 2, each user's data
arising from some higher-layer application is first placed into a dedicated buffer. Periodically, the BS removes
some of the data from each user's buffer, jointly encodes them into a block of symbols, and then broadcasts
the encoded block to all users through the MISO-BC. For simplicity, it is assumed that all users' data arrive
at their dedicated buffers synchronously at the beginning of each scheduling period. The data arrival processes
of users are assumed to be stationary and ergodic, mutually independent, and also independent of their channel
realizations. This paper considers two types of data traffic with very different delay requirements: one is the
delay-tolerant packet data and the other is the delay-sensitive circuit data, for which the following assumptions
are made:

• For a user with packet data application, the data arrival process is not necessarily continuous in time and 
the amount of arrived data in each scheduling period may be variable. All data are stored in a buffer of 
a sufficiently large size such that data dropping due to buffer overflow does not occur. In order for the 
scheduler to optimally exploit the channel dynamics, the allocated transmit rate can be variable during each 
scheduling period. It is assumed that there is always a sufficient amount of backlogged data in the buffer 
for transmission. The scheduler needs to ensure that the transmit rate averaged over scheduling periods in 
the long run is no smaller than the average data arrival rate. However, the exact amount of delay incurred 
to transmitted data in the buffer is not guaranteed. 

• For a user with circuit data application, the data arrival process is continuous with a constant-rate during 
each scheduling period. The arrived data is stored in the buffer for only one scheduling period and then 
transmitted. Therefore, the amount of delay incurred to transmitted data is minimal. However, the scheduler 
needs to ensure a constant-rate transmission independent of channel condition. 

Let the users with packet data applications be represented by the set $\mathcal{U}_{\mathrm{NDC}}$, where NDC refers to
no-delay-constrained, and the users with circuit data applications be represented by $\mathcal{U}_{\mathrm{DC}}$, where DC refers to delay-constrained.
In this paper, we consider optimal dynamic resource allocation to minimize the average transmit power at the BS
over different fading states subject to the constraint that all NDC and DC user rate demands are satisfied. Recall
that $r_k(n)$ denotes the rate assigned to user $k$ by the scheduler at fading state $n$. For an NDC user, it is required that
the average transmit rate $\mathbb{E}_n[r_k(n)]$ over fading states be no smaller than its average data arrival rate $R_k^*$.
In contrast, for a DC user, the transmit rate $r_k(n)$ at any fading state $n$ needs to satisfy its constant data arrival
rate $R_k^*$. By considering the dual SIMO-MAC in Section II with $R_k(n) = r_k(n)$, $q_k(n) = p_k(n)$, $\forall k, n$, optimal
allocation of transmit rates and powers at different $n$ can be obtained by solving the following optimization
problem (P1):

\begin{align}
\text{Minimize} \quad & \mathbb{E}_n\Big[\sum_{k=1}^{K} q_k(n)\Big] \tag{5} \\
\text{Subject to} \quad & \mathbb{E}_n[R_k(n)] \ge R_k^*, \quad \forall k \in \mathcal{U}_{\mathrm{NDC}} \tag{6} \\
& R_k(n) \ge R_k^*, \quad \forall n,\ k \in \mathcal{U}_{\mathrm{DC}} \tag{7} \\
& \boldsymbol{R}(n) \in \mathcal{C}_{\mathrm{MAC}}\big(\{q_k(n)\}, \{\boldsymbol{h}_k^{\dagger}(n)\}\big), \quad \forall n \tag{8} \\
& q_k(n) \ge 0, \quad \forall n, k. \tag{9}
\end{align}

For the above problem, the objective function and all the constraints except (8) are affine. It can also be verified
from (3) that $\mathcal{C}_{\mathrm{MAC}}$ is a convex set with any given positive $\{q_k(n)\}$. Therefore, Problem P1 is a convex
optimization problem [20] and thus can be solved using efficient convex optimization techniques, as will be
shown next.

B. Proposed Solution 

The Lagrange-duality method is usually applied when a convex optimization problem can be more conveniently
solved in its dual domain than in its original form. In this paper, we also apply this method for solving Problem
P1. The first step for the Lagrange-duality method is to introduce dual variables associated with some constraints
of the original problem. For Problem P1, which has multiple constraints, there are also various ways to introduce
dual variables that might result in different dual problems. For the following proposed solution, dual variables
are chosen with the aim of facilitating its real-time implementation, as will be explained later in Section III-C.

As a first step, a set of dual variables $\{\mu_k\}$, $\mu_k \ge 0$, $k \in \mathcal{U}_{\mathrm{NDC}}$, are introduced for the NDC users with respect
to (w.r.t.) their average-rate constraints in (6). The Lagrangian of Problem P1 can then be expressed as

$$\mathcal{L}\big(\{q_k(n)\}, \{R_k(n)\}, \{\mu_k\}\big) = \mathbb{E}_n\Big[\sum_{k=1}^{K} q_k(n)\Big] - \sum_{k \in \mathcal{U}_{\mathrm{NDC}}} \mu_k \big(\mathbb{E}_n[R_k(n)] - R_k^*\big). \tag{10}$$

Denoting the set of $\{q_k(n)\}$ and $\{R_k(n)\}$ specified by the remaining constraints in (7), (8) and (9) as $\mathcal{D}$, the
Lagrange dual function is expressed as

$$g(\{\mu_k\}) = \min_{\{q_k(n), R_k(n)\} \in \mathcal{D}} \mathcal{L}\big(\{q_k(n)\}, \{R_k(n)\}, \{\mu_k\}\big). \tag{11}$$

The dual problem of the original (primal) problem can then be expressed as

$$\max_{\mu_k \ge 0,\ k \in \mathcal{U}_{\mathrm{NDC}}} g(\{\mu_k\}). \tag{12}$$



Because the primal problem is convex and also satisfies Slater's condition [20],$^{1}$ the duality gap between the
optimal value of the primal problem and that of the dual problem is zero. This suggests that the problem
at hand can be equivalently solved in its dual domain by first minimizing the Lagrangian $\mathcal{L}$ to obtain the dual
function $g(\{\mu_k\})$ for some given $\{\mu_k\}$, and then maximizing $g(\{\mu_k\})$ over $\{\mu_k\}$.

Consider first the minimization problem in (11) to obtain $g(\{\mu_k\})$ for some given $\{\mu_k\}$. It is interesting to
observe that this problem can be solved by considering parallel subproblems, each corresponding to one fading
state $n$. From (10), the subproblem for fading state $n$ can be written as (P2)

\begin{align}
\text{Minimize} \quad & \sum_{k=1}^{K} q_k(n) - \sum_{k \in \mathcal{U}_{\mathrm{NDC}}} \mu_k R_k(n) \tag{13} \\
\text{Subject to} \quad & R_k(n) \ge R_k^*, \quad \forall k \in \mathcal{U}_{\mathrm{DC}} \tag{14} \\
& \boldsymbol{R}(n) \in \mathcal{C}_{\mathrm{MAC}}\big(\{q_k(n)\}, \{\boldsymbol{h}_k^{\dagger}(n)\}\big) \tag{15} \\
& q_k(n) \ge 0, \quad \forall k. \tag{16}
\end{align}

Hence, the dual function $g(\{\mu_k\})$ can be obtained by solving subproblems all having the identical structure, a
technique usually referred to as the Lagrange-dual decomposition. For solving Problem P2 for each $n$, a new
set of non-negative dual variables $\delta_k(n)$, $k \in \mathcal{U}_{\mathrm{DC}}$, are introduced for DC users w.r.t. their constant-rate constraints in
(14). The Lagrangian of Problem P2 can then be expressed as

$$\mathcal{L}_n\big(\{q_k\}, \{R_k\}, \{\delta_k\}\big) = \sum_{k=1}^{K} q_k - \sum_{k \in \mathcal{U}_{\mathrm{NDC}}} \mu_k R_k - \sum_{k \in \mathcal{U}_{\mathrm{DC}}} \delta_k \big(R_k - R_k^*\big). \tag{17}$$

Note that for brevity, the index $n$ is dropped in $q_k(n)$, $R_k(n)$ and $\delta_k(n)$ in (17) since it is applicable for all $n$.
The corresponding dual function can then be defined as

$$g_n(\{\delta_k\}) = \min_{\{q_k, R_k\} \in \mathcal{D}_n} \mathcal{L}_n\big(\{q_k\}, \{R_k\}, \{\delta_k\}\big), \tag{18}$$

where $\mathcal{D}_n$ denotes the set of $\{q_k\}$ and $\{R_k\}$ specified by the remaining constraints (15) and (16) at fading state
$n$. The associated dual problem is then defined as

$$\max_{\delta_k \ge 0,\ k \in \mathcal{U}_{\mathrm{DC}}} g_n(\{\delta_k\}). \tag{19}$$

$^{1}$ Slater's condition requires that the feasible set of the optimization problem has non-empty interior, which is in general the case for
Problem P1 because with sufficiently large average transmit power, any finite user rates $\{R_k^*\}$ are achievable.

Similar to P1, Problem P2 can also be solved by first minimizing $\mathcal{L}_n$ to obtain the dual function $g_n(\{\delta_k\})$ for
a given set of $\{\delta_k\}$, and then maximizing $g_n$ over $\{\delta_k\}$. From (17), the minimization problem in (18) can be
expressed as

\begin{align}
\text{Minimize} \quad & \sum_{k=1}^{K} q_k - \sum_{k \in \mathcal{U}_{\mathrm{NDC}}} \mu_k R_k - \sum_{k \in \mathcal{U}_{\mathrm{DC}}} \delta_k R_k \tag{20} \\
\text{Subject to} \quad & \boldsymbol{R} \in \mathcal{C}_{\mathrm{MAC}}\big(\{q_k\}, \{\boldsymbol{h}_k^{\dagger}\}\big) \tag{21} \\
& q_k \ge 0, \quad \forall k. \tag{22}
\end{align}

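Let $\beta_k = \mu_k$ if $k \in \mathcal{U}_{\mathrm{NDC}}$, and $\beta_k = \delta_k$ if $k \in \mathcal{U}_{\mathrm{DC}}$, and let $\pi$ be a permutation over $\{1, \ldots, K\}$ such that
$\beta_{\pi(1)} \ge \beta_{\pi(2)} \ge \cdots \ge \beta_{\pi(K)}$, and let $\beta_{\pi(K+1)} = 0$. Thanks to the polymatroid structure of $\mathcal{C}_{\mathrm{MAC}}$ [3], the above
problem can be simplified as (P3)

\begin{align}
\text{Minimize} \quad & \sum_{k=1}^{K} q_k - \sum_{k=1}^{K} \big(\beta_{\pi(k)} - \beta_{\pi(k+1)}\big) \log_2 \Big| \sum_{i=1}^{k} \boldsymbol{h}_{\pi(i)}^{\dagger} \boldsymbol{h}_{\pi(i)}\, q_{\pi(i)} + \boldsymbol{I} \Big| \tag{23} \\
\text{Subject to} \quad & q_k \ge 0, \quad \forall k. \tag{24}
\end{align}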

Problem P3 is convex because all the constraints are affine and the objective function is convex w.r.t. $\{q_k\}$.
Hence, this problem can be solved, e.g., by the interior-point method [20]. In Appendix I, an alternative method
based on the block-coordinate descent principle [21], which iteratively optimizes $q_k$ with all the other $\{q_{k'}\}$, $k' \ne k$,
held fixed, is presented. This method can be considered as a generalization of the algorithm described in [22], where
all $\beta_{\pi(k)}$, $k = 1, \ldots, K$, are equal.
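To make the structure of Problem P3 concrete, the sketch below evaluates a weighted-rate objective of the polymatroid form described above for a given power vector, together with the successive-decoding rates of the corresponding corner point. It assumes the objective is total power minus a telescoped sum of weight differences times log-determinants, with users sorted by decreasing weight; the function names and array layout (row `k` of `h` is user `k`'s channel) are hypothetical. It only evaluates the objective and does not implement the interior-point or block-coordinate descent solvers mentioned in the text.

```python
import numpy as np

def _ordered_logdets(q, beta, h):
    """Sort users by decreasing weight and accumulate log2-determinants."""
    q = np.asarray(q, dtype=float)
    order = np.argsort(-np.asarray(beta, dtype=float), kind="stable")
    S = np.eye(h.shape[1], dtype=complex)
    logdets = []
    for u in order:
        S = S + q[u] * np.outer(h[u].conj(), h[u])   # add q_u * h_u^H h_u
        logdets.append(np.linalg.slogdet(S)[1] / np.log(2))
    return order, np.array(logdets)

def p3_objective(q, beta, h):
    """Total power minus the weighted sum rate at the polymatroid corner point."""
    order, logdets = _ordered_logdets(q, beta, h)
    b = np.append(np.asarray(beta, dtype=float)[order], 0.0)  # weight of index K+1 is 0
    return float(np.sum(q) - np.sum((b[:-1] - b[1:]) * logdets))

def corner_rates(q, beta, h):
    """Per-user rates at the corner point, indexed by the original user index."""
    order, logdets = _ordered_logdets(q, beta, h)
    rates = np.empty(len(order))
    rates[order] = np.diff(logdets, prepend=0.0)
    return rates

h = np.eye(2, dtype=complex)          # orthogonal channels for a hand-checkable case
q, beta = [1.0, 3.0], [1.0, 2.0]
value = p3_objective(q, beta, h)      # 4 - (1*1 + 2*2) = -1
rates = corner_rates(q, beta, h)      # [log2(1+1), log2(1+3)]
```

The two functions are consistent by construction: the objective equals total power minus the weighted sum of `corner_rates`, which is what the telescoping in the objective expresses.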

So far, solutions have been presented for the minimization problem in (11) to obtain $g(\{\mu_k\})$ for some given
$\{\mu_k\}$ and that in (18) to obtain each $g_n(\{\delta_k\})$ for some given $\{\delta_k\}$. Next, the remaining issue of how to update
$\{\mu_k\}$ to maximize $g(\{\mu_k\})$ for the dual problem in (12) is addressed. Similar techniques can also be used for
updating $\{\delta_k\}$ to maximize $g_n(\{\delta_k\})$ for each dual problem in (19). From (10) and (11), it is observed that the
dual function $g(\{\mu_k\})$, though concave w.r.t. $\{\mu_k\}$ (it is a pointwise minimum of functions affine in $\{\mu_k\}$), is not
necessarily differentiable w.r.t. $\{\mu_k\}$. Hence, standard methods such as Newton's method cannot be employed to update
$\{\mu_k\}$ for maximizing $g(\{\mu_k\})$. An appropriate choice here
may be the sub-gradient-based method [21] that is capable of handling non-differentiable functions. This method
is an iterative algorithm and at each iteration, it requires a sub-gradient at the corresponding value of $\{\mu_k\}$ to
update $\{\mu_k\}$ for the next iteration. Suppose that after solving Problem P2 for some given $\{\mu_k\}$ at all fading
states $n$, the obtained rates and powers are denoted by $\{R'_k(n)\}$ and $\{q'_k(n)\}$, respectively, $k \in \mathcal{U}_{\mathrm{NDC}}$. The
following lemma then provides a suitable sub-gradient for $\{\mu_k\}$:

Lemma 3.1: If $\mathcal{L}(\{q'_k(n)\}, \{R'_k(n)\}, \{\mu_k\}) = g(\{\mu_k\})$, then the vector $\boldsymbol{\nu}$ defined as $\nu_k = R_k^* - \mathbb{E}_n[R'_k(n)]$ for
$k \in \mathcal{U}_{\mathrm{NDC}}$ is a sub-gradient of $g$ at $\{\mu_k\}$.
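Proof: For any set of $\theta_k$'s, $\theta_k \ge 0$, $k \in \mathcal{U}_{\mathrm{NDC}}$, it follows that

\begin{align}
g(\{\theta_k\}) &\le \mathcal{L}\big(\{q'_k(n)\}, \{R'_k(n)\}, \{\theta_k\}\big) \tag{25} \\
&= g(\{\mu_k\}) + \sum_{k \in \mathcal{U}_{\mathrm{NDC}}} (\theta_k - \mu_k)\big(R_k^* - \mathbb{E}_n[R'_k(n)]\big). \tag{26}
\end{align}

Hence, it is clear that the optimal dual solution $\{\mu_k^*\}$ that maximizes $g(\{\mu_k\})$ should not lie in the half space
represented by $\{\boldsymbol{\theta} : \sum_{k \in \mathcal{U}_{\mathrm{NDC}}} (\theta_k - \mu_k)\nu_k < 0\}$. As a result, any new update of $\{\mu_k\}$ for the next iteration,
denoted by $\{\mu_k^{\mathrm{new}}\}$, should satisfy $\sum_{k \in \mathcal{U}_{\mathrm{NDC}}} (\mu_k^{\mathrm{new}} - \mu_k)\nu_k \ge 0$. $\square$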
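The step from the inequality to the equality in the proof can be spelled out as follows. This is a sketch that assumes the Lagrangian of Problem P1 has the form "average total power plus the $\mu$-weighted rate shortfalls of the NDC users", so that it is affine in the dual variables for fixed primal variables.

```latex
\begin{align*}
\mathcal{L}\big(\{q'_k(n)\},\{R'_k(n)\},\{\theta_k\}\big)
  &= \mathbb{E}_n\Big[\sum_{k=1}^{K} q'_k(n)\Big]
     + \sum_{k\in\mathcal{U}_{\mathrm{NDC}}} \theta_k\big(R_k^{*}-\mathbb{E}_n[R'_k(n)]\big) \\
  &= \underbrace{\mathbb{E}_n\Big[\sum_{k=1}^{K} q'_k(n)\Big]
     + \sum_{k\in\mathcal{U}_{\mathrm{NDC}}} \mu_k\big(R_k^{*}-\mathbb{E}_n[R'_k(n)]\big)}_{=\,\mathcal{L}(\{q'_k(n)\},\{R'_k(n)\},\{\mu_k\})\,=\,g(\{\mu_k\})}
     + \sum_{k\in\mathcal{U}_{\mathrm{NDC}}} (\theta_k-\mu_k)\,\nu_k .
\end{align*}
```

The inequality itself holds because the dual function at $\{\theta_k\}$ is a minimum over all feasible primal variables, and $\{q'_k(n)\}$, $\{R'_k(n)\}$ is one feasible choice.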

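By applying Lemma 3.1, a simple rule for updating $\{\mu_k\}$ is as follows:$^{2}$

$$\mu_k^{\mathrm{new}} = \Big(\mu_k + \Delta\big(R_k^* - \mathbb{E}_n[R'_k(n)]\big)\Big)^{+}, \quad k \in \mathcal{U}_{\mathrm{NDC}}, \tag{27}$$

where $\Delta$ is a small positive step size. Similarly, at each fading state $n$, $\{\delta_k(n)\}$ can also be updated towards
maximizing $g_n(\{\delta_k(n)\})$ as

$$\delta_k^{\mathrm{new}}(n) = \Big(\delta_k(n) + \Delta\big(R_k^* - R'_k(n)\big)\Big)^{+}, \quad k \in \mathcal{U}_{\mathrm{DC}}, \tag{28}$$

where $\{R'_k(n)\}$, $k \in \mathcal{U}_{\mathrm{DC}}$, is the solution obtained after solving Problem P3 at fading state $n$.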
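The projected sub-gradient update for the NDC dual variable can be illustrated on a deliberately simplified special case: a single NDC user with a single antenna and a finite set of fading states, for which the per-state minimization has a closed-form water-filling solution. This scalar setting, the function names, and the parameter values are assumptions made for illustration only; the multiuser MISO case requires solving Problem P3 at each fading state instead.

```python
import numpy as np

def inner_solution(mu, gains):
    """Per-state minimizer of q - mu * log2(1 + g * q) over q >= 0 (water-filling)."""
    q = np.maximum(mu / np.log(2) - 1.0 / gains, 0.0)
    return q, np.log2(1.0 + gains * q)

def solve_mu(gains, probs, target, step=0.1, iters=500):
    """Projected sub-gradient ascent on the dual variable of one NDC user."""
    mu = 0.0
    for _ in range(iters):
        _, rates = inner_solution(mu, gains)
        avg_rate = float(probs @ rates)                  # expectation over fading states
        mu = max(mu + step * (target - avg_rate), 0.0)   # update, then project onto mu >= 0
    q, rates = inner_solution(mu, gains)
    return mu, q, rates

gains = np.array([0.5, 1.0, 2.0])     # channel power gains of three fading states
probs = np.full(3, 1.0 / 3.0)         # equally likely states
mu, q, rates = solve_mu(gains, probs, target=1.0)
avg_rate = float(probs @ rates)       # approaches the target rate of 1 bit per use
avg_power = float(probs @ q)          # resulting average transmit power
```

In this example the dual variable settles at $2 \ln 2$, where the weakest state is switched off and the two stronger states carry 1 and 2 bits, so the average-rate constraint is met with equality.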

To summarize, the proposed solution is implemented by a two-layer Lagrange-duality method. At the first layer,
the algorithm searches iteratively for $\{\mu_k^*\}$ with which the average-rate constraints of all NDC users are satisfied.
At each iteration, an update for $\{\mu_k\}$ is generated and then passed to the second layer, where the algorithm starts
a parallel search for $\{\delta_k^*(n)\}$, each for a fading state $n$, such that the constant-rate constraints of all DC users
are satisfied at all fading states. The resultant $\{R'_k(n)\}$ of NDC users is then passed back to the first layer for
another update of $\{\mu_k\}$. The overall algorithm is summarized in Table I. The complexity of this algorithm can be
derived as follows. Supposing that the ellipsoid method is used to iteratively update dual variables, the required
number of iterations for convergence is $O(m^2)$, where $m$ is the size of the problem. Let $K_{\mathrm{NDC}}$ and $K_{\mathrm{DC}}$ denote
the size of $\mathcal{U}_{\mathrm{NDC}}$ and $\mathcal{U}_{\mathrm{DC}}$, respectively, where $K_{\mathrm{NDC}} + K_{\mathrm{DC}} = K$. Therefore, the ellipsoid method will need
$O(K_{\mathrm{NDC}}^2)$ iterations for obtaining $\{\mu_k^*\}$ and $O(N K_{\mathrm{DC}}^2)$ iterations for obtaining $\{\delta_k^*(n)\}$ for all fading states,
assuming the number of fading states $n$ is finite and is equal to $N$. The complexity for solving Problem P3 is
$O(K^3)$ by, e.g., the interior-point method. Hence, the total complexity of the algorithm is $O(K_{\mathrm{NDC}}^2 K_{\mathrm{DC}}^2 K^3 N)$.

$^{2}$ Notice that a more efficient sub-gradient-based method to iteratively find $\{\mu_k^*\}$ is the ellipsoid method [23], which at each iteration
removes the half space specified by $\{\boldsymbol{\theta} : \sum_{k \in \mathcal{U}_{\mathrm{NDC}}} (\theta_k - \mu_k)\nu_k < 0\}$ for searching $\{\mu_k^*\}$.
Finally, note that the proposed algorithm jointly optimizes transmit powers and rates together with decoding
orders (determined by the magnitudes of $\{\mu_k^*\}$ and $\{\delta_k^*(n)\}$ [3]) of users at all fading states for the dual fading
SIMO-MAC. By the BC-MAC duality result [17], the optimal transmit powers, rates, precoding vectors as well
as encoding orders of users for the original fading MISO-BC can be obtained.

C. Online Algorithm 

One important issue yet to be addressed for implementing the proposed solution for Problem P1 is how to relax
its assumption on perfect knowledge of the distribution of fading state $n$. Notice that this knowledge is necessary
for computing the achievable average rates $\{\mathbb{E}_n[R'_k(n)]\}$, $k \in \mathcal{U}_{\mathrm{NDC}}$, which are needed for updating the dual
variables $\{\mu_k\}$ for NDC users in (27). In practice, although it is reasonable to assume each user's fading channel
is stationary and ergodic, the space of fading states is usually continuous and infinite and hence it is infeasible
for the BS to initially acquire the channel distribution information for all users at all fading states. Even if
this information were available for off-line implementation of the proposed solution, the computational complexity
would become unbounded as the number of fading states goes to infinity. Therefore, in this paper a modified "online"
algorithm is developed that is able to adaptively update $\{\mu_k\}$ towards $\{\mu_k^*\}$ as transmission proceeds over time.
Let $t$ denote the transmission block index, $t = 1, 2, \ldots$. The key for the online algorithm is to approximate
the statistical average $\mathbb{E}_n[R'_k(n)]$, $k \in \mathcal{U}_{\mathrm{NDC}}$, at time $t$ by a time average of transmitted rates up to time $t - 1$,
denoted by $\bar{R}_k[t - 1]$, where $\bar{R}_k[t]$ is obtained as

$$\bar{R}_k[t] = (1 - \epsilon)\, \bar{R}_k[t - 1] + \epsilon\, R_k[t], \tag{29}$$

where $R_k[t]$ is the transmitted rate at time $t$, and $\epsilon$, $0 < \epsilon \ll 1$, is a parameter that controls the convergence
speed of $\bar{R}_k[t] \to \mathbb{E}_n[R'_k(n)]$ as $t \to \infty$. By replacing $\mathbb{E}_n[R'_k(n)]$ by $\bar{R}_k[t - 1]$ in (27), $\{\mu_k[t]\}$ at time $t$ can be
updated accordingly such that as $t \to \infty$ it converges to $\{\mu_k^*\}$ because $\bar{R}_k[t] \to R_k^*$, $\forall k \in \mathcal{U}_{\mathrm{NDC}}$. This modified
online algorithm is summarized in Table II.
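The online idea, replacing the statistical average rate by an exponentially weighted time average and updating the dual variable once per block, can be sketched for the same simplified single-user, single-antenna setting. The Rayleigh-fading gain model, the step sizes, and the function names below are illustrative assumptions, not the contents of Table II.

```python
import numpy as np

def ema_rate(prev_avg, rate, eps):
    """Exponentially weighted time average of the transmitted rate."""
    return (1.0 - eps) * prev_avg + eps * rate

def online_mu_update(mu, target, prev_avg, step):
    """Dual update that uses the time average in place of the statistical mean."""
    return max(mu + step * (target - prev_avg), 0.0)

def run_online(target=1.0, eps=0.01, step=0.01, blocks=20000, seed=0):
    """Single-user toy loop: one channel draw, one dual update, one rate per block."""
    rng = np.random.default_rng(seed)
    mu, avg = 1.0, 0.0
    for _ in range(blocks):
        gain = rng.exponential(1.0)                  # assumed Rayleigh-fading power gain
        mu = online_mu_update(mu, target, avg, step)
        # Water-filling rate for weight mu; zero when the channel is too weak.
        rate = float(np.log2(max(mu * gain / np.log(2), 1.0)))
        avg = ema_rate(avg, rate, eps)
    return mu, avg

mu_final, avg_final = run_online(blocks=200)
```

A smaller averaging parameter gives a smoother but slower-moving rate estimate, which is the trade-off the text attributes to $\epsilon$; no convergence guarantee is claimed for this toy loop.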

Cross-Layer Implementation: The proposed online algorithm based on the Lagrange-duality method is
amenable to cross-layer implementation of both PHY-layer transmission and MAC-layer multiuser rate scheduling.
One challenging issue for cross-layer optimization is how to select useful and succinct information for different
layers to exchange and share so as to optimize their individual operations. The Lagrange-duality method provides
a new design paradigm for efficient cross-layer information exchange. On the one hand, since the MAC-layer has
the knowledge of user rate demands $\{R_k^*\}$ as well as their delay requirements (NDC or DC), it can update dual
variables $\{\mu_k[t]\}$ and $\{\delta_k[t]\}$ accordingly (see Table II), and then pass them to the PHY-layer for computing the
desired transmission rates of users $\{R_k[t]\}$ to meet each user's specific rate demand. On the other hand,
the PHY-layer is able to provide the MAC-layer with the updated $\{R_k[t]\}$ that optimize the PHY-layer transmission
by solving Problem P3 given $\{\mu_k[t]\}$ and $\{\delta_k[t]\}$. Therefore, by exchanging dual variables and transmission rates
between PHY- and MAC-layers, cross-layer dynamic resource allocation can be efficiently implemented.

Comparison with PFS: It is interesting to compare the proposed online algorithm with the
well-known PFS algorithm. PFS is designed for real-time multiuser rate scheduling in a mobile wireless network
to ensure a certain fairness in user rate allocation while maximizing the network throughput. PFS applies to
packet data transmission and hence corresponds to the transmission scenario considered in this paper when only
NDC users are present. At each time $t$, the transmit rates $\{R_k[t]\}$ assigned to users by PFS maximize the weighted
sum-rate of users $\sum_k \omega_k[t] R_k[t]$, where the weights are given by $\omega_k[t] = \frac{1}{\bar{R}_k[t-1]}$, and $\bar{R}_k[t-1]$ is the estimated
average rate for user $k$ up to time $t-1$, the same as expressed in (29). Using this rule, it has been shown (e.g.,
[13]-[16] and references therein) that as $t \to \infty$, $\bar{R}_k[t] \to R_k^*$, where $R_k^*$ is the average achievable rate for user
$k$ in the long term. Furthermore, PFS maximizes $\sum_k \log(R_k^*)$ over the expected capacity region (please refer
to Definition 4.1 in Section IV) and, hence, $\{R_k^*\}$ can be considered as the unique intersection of the surface
specified by $\prod_k R_k = c$, for a constant $c$, with the boundary of the expected capacity region. Because of the $\log(\cdot)$ function, the
rate assignments among users by PFS are regulated in a balanced manner such that no user can be allocated an
overwhelmingly larger rate than the others even if it has a superior channel condition. However, the achievable
rates $\{R_k^*\}$ by PFS are not guaranteed to satisfy any desired rate demand of users. In contrast, the proposed
online algorithm ensures that each NDC user's average-rate demand is satisfied by applying a different rule (see
Table II) for updating the user weights $\{\mu_k[t]\}$ in the resource allocation problem (see Problem P3) that is solved at
each $t$.
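
The PFS weight rule described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not part of this paper: a single-antenna channel in which one user is served per time slot (so the weighted sum-rate is maximized by picking the user with the largest ratio of instantaneous rate to average rate), Rayleigh fading, and an exponential averaging rule with a hypothetical parameter `eps`. The function name `pfs_schedule` and all parameters are illustrative.

```python
import math
import random

def pfs_schedule(mean_snrs, num_slots=5000, eps=0.01, seed=0):
    """Toy PFS scheduler: serve one user per slot, chosen by rate / average rate."""
    rng = random.Random(seed)
    avg = [1e-3] * len(mean_snrs)   # small positive start avoids division by zero
    served = [0] * len(mean_snrs)
    for _ in range(num_slots):
        # Rayleigh fading (assumed): the power gain is exponential with the given mean SNR.
        rates = [math.log2(1.0 + rng.expovariate(1.0 / snr)) for snr in mean_snrs]
        # PFS weights are the reciprocals of the running average rates.
        k_star = max(range(len(rates)), key=lambda k: rates[k] / avg[k])
        served[k_star] += 1
        for k in range(len(avg)):
            r = rates[k] if k == k_star else 0.0
            avg[k] = (1.0 - eps) * avg[k] + eps * r   # exponential averaging (assumed form)
    return avg, served
```

Calling `pfs_schedule([10.0, 1.0])` returns the running average rates and the number of slots in which each user was served. The weaker user is still served regularly, but nothing in the rule ties the resulting averages to a prescribed rate demand, which is the distinction drawn above.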

IV. Throughput-Delay Tradeoff 

For a single-user BF channel, the expected capacity [24], [25] and the delay-limited capacity [4] can be
considered as the fading channel capacity limits under two extreme cases of delay constraint. The former is
always larger than the latter, and their difference, termed the delay penalty, characterizes a fundamental
throughput-delay tradeoff for dynamic resource allocation over a single-user fading channel. The delay penalty
may or may not be large for a single-user fading channel. For example, for a SISO-BF channel,
the delay penalty can be substantial because the delay-limited capacity is zero if the fading channel is
not "invertible" with a finite average transmit power [26]. However, when multiple antennas are employed at the
transmitter and/or receiver, the delay-limited capacity of the MIMO-BF channel can be very close to the
expected capacity [27], i.e., the delay penalty is negligible. This result can be explained by the "channel-hardening"
effect [28] for random MIMO channels, i.e., the mutual information of independent MIMO channels varies
less because of antenna-induced space diversity, and hence the value of power and rate adaptation over
time vanishes as the number of antennas becomes large. This result indicates that, from an information-theoretic
viewpoint, MIMO channels are highly suitable for transmission of real-time and delay-constrained data traffic.

Characterization of the delay penalty in a multiuser fading channel is more challenging. The capacity definitions
for the single-user fading channel can be extended to the multiuser channel as the expected capacity region under
no delay constraint for all users, and the delay-limited capacity region under a zero-delay constraint for all users.
Therefore, the delay penalty can be measured by directly comparing these two capacity regions. Since a capacity
region contains the achievable rates of all users, it lies in a $K$-dimensional space, where $K$ is the number of
users in the network. As a result, characterization of the capacity region becomes inconvenient as $K$ becomes large.
In order to overcome this difficulty, prior work usually adopts the maximum sum-rate of users over the
capacity region, termed the sum capacity, as a simplified measure of the network throughput. Applying the sum
capacity to the corresponding capacity region, the expected throughput and the delay-limited throughput can be
defined accordingly. However, the conventional sum capacity does not consider the rate allocation among users.
As a consequence, each user's allocated rate portion in the expected throughput may differ from
that in the delay-limited throughput. Hence, measuring the delay penalty as in the single-user case,
by taking the difference between the expected and delay-limited throughput, appears problematic at first glance
in the multiuser case.

This section presents a novel characterization of the fundamental throughput-delay tradeoff for the fading
MISO-BC. Instead of considering mixed NDC and DC transmission as in Section III, it is assumed here that
only NDC users or only DC users are present, and comparison of the network throughput under these two extreme cases of
delay constraint is of interest. Because it is hard to compare directly the expected and the delay-limited capacity
regions as the number of users becomes large, the sum capacity is also considered for simplicity. However, unlike
the conventional sum capacity, which does not regulate the rate allocation among users, the expected and
delay-limited throughput in this paper are defined under a new constraint that regulates each user's allocated rate
portion based upon a prescribed rate profile. Then, the delay penalty is characterized by the difference between
the expected and delay-limited throughput under the same rate profile. The above concepts are more explicitly
defined as follows:

Definition 4.1: The expected capacity region for a fading MISO-BC expressed in (1) under an LTPC $p^*$ can be
defined as

$$\mathcal{C}_e(p^*) = \left\{ r \in \mathbb{R}_+^K : r_k = E_n[R_k(n)], \forall k, \ R(n) \in \mathcal{C}_{\mathrm{BC}}(p(n), \{h_k(n)\}), \forall n, \ E_n[p(n)] \le p^* \right\}. \tag{30}$$

Similarly, the delay-limited capacity region is defined as

$$\mathcal{C}_d(p^*) = \left\{ r \in \mathbb{R}_+^K : r_k = R_k(n), \forall k, n, \ R(n) \in \mathcal{C}_{\mathrm{BC}}(p(n), \{h_k(n)\}), \forall n, \ E_n[p(n)] \le p^* \right\}. \tag{31}$$

Definition 4.2: Let $R_k^*$ denote the $k$-th user's rate demand (average rate for an NDC user or constant rate for a DC
user), $k = 1, \ldots, K$. The rate profile is defined as a vector $\alpha = \{\alpha_1, \ldots, \alpha_K\}$, where $\alpha_k = \frac{R_k^*}{\sum_{i=1}^{K} R_i^*}$, $k = 1, \ldots, K$.

Definition 4.3: The expected throughput $C_e(p^*, \alpha)$ (delay-limited throughput $C_d(p^*, \alpha)$) associated with a
prescribed rate profile $\alpha$ under an LTPC $p^*$ is defined as the maximum sum-rate of users over the expected
(delay-limited) capacity region under the constraint that the average (constant) transmit rate $r_k$ of each user must
satisfy $\frac{r_i}{r_j} = \frac{\alpha_i}{\alpha_j}$, $\forall i, j \in \{1, \ldots, K\}$.

Definition 4.4: For a given rate profile $\alpha$ and LTPC $p^*$, the delay penalty $C_{\mathrm{DP}}(p^*, \alpha)$ is equal to
$C_e(p^*, \alpha) - C_d(p^*, \alpha)$.
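
As a small worked illustration of Definitions 4.2 and 4.3 (the rate demands below are hypothetical values chosen for the example), consider $K = 2$ users with $R_1^* = 2$ and $R_2^* = 1$ bits/complex dimension:

```latex
\alpha_1 = \frac{2}{2 + 1} = \frac{2}{3}, \qquad
\alpha_2 = \frac{1}{2 + 1} = \frac{1}{3}, \qquad
\frac{r_1}{r_2} = \frac{\alpha_1}{\alpha_2} = 2 .
```

Any sum-rate $C$ achieved under this rate profile therefore splits as $r_1 = \frac{2}{3} C$ and $r_2 = \frac{1}{3} C$, and the delay penalty compares the largest such $C$ without and with the constant-rate requirement.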

The proposed definition of the delay penalty is illustrated in Fig. 3 for a 2-user case. From Definition 4.4, it is
noted that characterization of $C_{\mathrm{DP}}(p^*, \alpha)$ requires that of both the expected and the delay-limited throughput.
Next, we present the algorithm for characterizing $C_e(p^*, \alpha)$ for given $p^*$ and $\alpha$. A similar algorithm can also
be developed for characterizing $C_d(p^*, \alpha)$ and is thus omitted here for brevity. According to Definition 4.3 and
(30), and using the BC-MAC duality result in Section II, the expected throughput $C_e(p^*, \alpha)$ can be obtained by
considering the following optimization problem (P4):



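$$
\begin{aligned}
\text{Maximize} \quad & C_e && (32) \\
\text{Subject to} \quad & E_n[R_k(n)] \ge C_e \alpha_k, \ \forall k && (33) \\
& R(n) \in \mathcal{C}_{\mathrm{MAC}}\left(\{q_k(n)\}, \{h_k^\dagger(n)\}\right), \ \forall n && (34) \\
& q_k(n) \ge 0, \ \forall n, k && (35) \\
& E_n\left[\sum_{k=1}^{K} q_k(n)\right] \le p^* && (36)
\end{aligned}
$$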



Similar to Problem P1, it can be verified that the above problem is also convex and hence can be solved
using convex optimization techniques. Here, instead of solving Problem P4 from scratch, the proposed solution
transforms this problem into a special form of Problem P1, so that the same algorithm for Problem P1 can be
applied. First, consider the following transmit power minimization problem (P5) for support of an arbitrary
set of average-rate demands $\{R_k^*\}$ that satisfy a given rate-profile constraint $\alpha$, i.e., $\frac{R_i^*}{R_j^*} = \frac{\alpha_i}{\alpha_j}$, $\forall i, j \in \{1, \ldots, K\}$.

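$$
\begin{aligned}
\text{Minimize} \quad & E_n\left[\sum_{k=1}^{K} q_k(n)\right] && (37) \\
\text{Subject to} \quad & E_n[R_k(n)] \ge R^*_{\mathrm{sum}} \alpha_k, \ \forall k && (38) \\
& R(n) \in \mathcal{C}_{\mathrm{MAC}}\left(\{q_k(n)\}, \{h_k^\dagger(n)\}\right), \ \forall n && (39) \\
& q_k(n) \ge 0, \ \forall n, k && (40)
\end{aligned}
$$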



Note that $R^*_{\mathrm{sum}} = \sum_{k=1}^{K} R_k^*$. It is observed that the above problem is a special case of Problem P1 if all users
have NDC transmission, i.e., the constant-rate constraints for DC users in (7) are removed. Hence, Problem P5
can also be solved by the proposed algorithm in Table I. Let $q^*$ denote the minimal transmit power obtained after
solving Problem P5. Notice that $C_e(p^*, \alpha)$ is a non-decreasing function of $p^*$ for a given $\alpha$ because the
expected capacity region $\mathcal{C}_e(p^*)$ corresponding to a larger $p^*$ always contains that with a smaller $p^*$. Hence, if
$p^* > q^*$, it can be inferred that $C_e(p^*, \alpha)$ must be larger than the assumed $R^*_{\mathrm{sum}}$. Otherwise, $C_e(p^*, \alpha) \le R^*_{\mathrm{sum}}$.
By using this property, $C_e(p^*, \alpha)$ can be easily obtained by a bisection search [20].
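
The bisection search over $R^*_{\mathrm{sum}}$ can be sketched as follows. This is a minimal illustration rather than the implementation used in this paper: `min_power` is a hypothetical oracle standing in for a solver of Problem P5 (it returns $q^*$ for a trial sum-rate under the fixed rate profile), and the bracket-doubling step and tolerance are assumptions of the sketch.

```python
import math

def expected_throughput(min_power, p_star, r_hi=1.0, tol=1e-6):
    """Largest sum-rate whose minimum required power does not exceed p_star.

    min_power(r_sum) is a hypothetical stand-in for a Problem-P5 solver and
    must be non-decreasing in r_sum.
    """
    r_lo = 0.0
    while min_power(r_hi) <= p_star:      # expand until the trial rate is infeasible
        r_lo, r_hi = r_hi, 2.0 * r_hi
    while r_hi - r_lo > tol:
        r_mid = 0.5 * (r_lo + r_hi)
        if min_power(r_mid) <= p_star:
            r_lo = r_mid                  # feasible: throughput is at least r_mid
        else:
            r_hi = r_mid                  # infeasible: throughput is below r_mid
    return r_lo
```

As a sanity check, replacing the oracle with the single-user AWGN relation `lambda r: 2.0 ** r - 1.0` makes the search return approximately `math.log2(1.0 + p_star)`, since that is the rate at which the required power equals the budget.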

Fairness Penalty: There is an interesting relationship between the conventional sum capacity over the expected
capacity region and the expected throughput as a function of the rate profile, as shown in Fig. 4 for a 2-user case.
The sum capacity is obtained by maximizing the sum-rate of users over the expected capacity region so as
to maximally exploit the multiuser diversity gain in the achievable network throughput. However, it does not
regulate the resultant rate allocation among users. Let the resultant rate portion allocated to each user in the
sum capacity be specified by a rate-profile vector $\alpha^*$. In contrast, the expected throughput maximizes the sum-rate
of users under an arbitrary rate-profile vector $\alpha_e$. Due to this hard fairness constraint, the expected throughput
in general is smaller than the sum capacity if $\alpha_e$ is different from $\alpha^*$, and their difference can be used as a
measure of the fairness penalty, which is denoted by $C_{\mathrm{FP}}(p^*, \alpha_e) = C_e(p^*, \alpha^*) - C_e(p^*, \alpha_e)$. Similarly, the
fairness penalty can also be defined in the delay-limited case.

Asymptotic Results: Consider the MISO-BC with an asymptotically large number of users $K$ but a fixed number
of transmit antennas $M$ at the BS. It is assumed that the network is homogeneous, where all users have mutually
independent but identically distributed channels and have identical rate demands, i.e., $\alpha_k = \frac{1}{K}$, $\forall k$. Under these
assumptions, it has been shown in [29] that as $K \to \infty$ the expected throughput $C_e$ under any finite power
constraint $p^*$ scales like $M \log_2 \log K + O(1)$. In the following theorem, we provide the corresponding asymptotic result for
the delay-limited throughput:

Theorem 4.1: Under the assumption of symmetric fading and symmetric user rate demand, the delay-limited
throughput $C_d$ for a fading MISO-BC under an LTPC $p^*$ is upper-bounded by $\frac{p^*}{\rho \log 2}$ as $K \to \infty$, where $\rho$ is a
constant depending solely on the channel distribution.

Proof: Please refer to Appendix II. ■

The above results suggest very different behaviors of the achievable network throughput with NDC versus DC
transmission as $K$ becomes large. On the one hand, for the expected throughput, transmission delay is not an
issue and hence the optimal strategy is to select only a subset of users with the best joint channel realizations
for transmission at any one time. The expected throughput thus scales linearly with $M$ and double-logarithmically with
$K$, where the former is due to the spatial multiplexing gain and the latter arises from the multiuser-diversity
gain. On the other hand, in the delay-limited case, a constant-rate transmission needs to be ensured for all users.
Consequently, as the number of users increases, though more degrees of freedom are available for optimizing
transmit parameters such as precoding vectors and the encoding order of users, the delay-limited throughput
eventually saturates. The above comparison demonstrates that for an SDMA-based network with a large user
population, transmission delay can be a critical factor that prevents the network from achieving the maximum asymptotic
throughput. However, notice that this may not be the case for a network with similar $M$ and $K$, as will be
verified later by the simulation results.

V. Simulation Results

In this section, simulation results are presented for evaluating the performance of dynamic resource allocation
for the fading MISO-BC under various transmission delay considerations. Since the network throughput is
contingent on transmit delay requirements as well as many other factors, such as the number of transmit antennas
at the BS, the number of mobile users, user channel conditions and rate requirements, and the transmit power
constraint at the BS, various combinations of these factors are considered in the following simulations with
the aim of demonstrating how transmission delay interacts with other factors in determining the achievable
network throughput. Note that in the following
simulations, user channel vectors $\{h_k(n)\}$ are independently generated from the population of CSCG vectors, and
unless stated otherwise, it is assumed that $h_k(n) \sim \mathcal{CN}(0, I)$, $\forall k$.

A. Transmit Optimization for Mixed NDC and DC Traffic 

First, consider a MISO-BC with $M = K = 4$, with two users having NDC transmission and the other two
having DC transmission. For convenience, it is assumed that the target average transmit rates for the two users with
NDC transmission are both equal to $R^*_{\mathrm{NDC}}$ and the target constant rates for the two users with DC transmission
are both equal to $R^*_{\mathrm{DC}}$. Let $\gamma$ denote the loading factor representing the ratio of the total amount of NDC traffic to
the sum of NDC and DC traffic, i.e., $\gamma = \frac{R^*_{\mathrm{NDC}}}{R^*_{\mathrm{NDC}} + R^*_{\mathrm{DC}}}$. If the proposed algorithm in Table I, which achieves
optimal dynamic resource allocation, is used, the required average transmit power at the BS is expected to be the
minimum for satisfying both NDC and DC user rate demands for any given $\gamma$. For the purpose of comparison, two
suboptimal transmission schemes are also considered:

• Time-Division-Multiple-Access (TDMA): A simple transmission scheme is to divide each transmission period
into $K$ consecutive equal-duration time-blocks, each dedicated to transmission of one user's data traffic. If
coherent precoding is applied at each fading state $n$, i.e., the precoder $b_k(n)$ for user $k$
is equal to $\frac{h_k^\dagger(n)}{\|h_k(n)\|}$, $\forall k$, it can be shown that the MISO-BC is decomposable into $K$ single-user SISO
channels. Depending on each user's delay requirement, conventional water-filling power control [24] and
channel-inversion power control [27] can be applied over different fading states for users with NDC and
DC transmission, respectively, to achieve the minimum average transmit power under the given user rate
demand.
• Zero-Forcing (ZF)-Based SDMA: Another possible transmission scheme is also based on the SDMA
principle, i.e., supporting all users' transmissions simultaneously by multi-antenna spatial multiplexing at
the BS. However, instead of using the dirty-paper-coding (DPC)-based non-linear precoding assumed in this
paper, a simple ZF-based linear precoding is employed at the BS. The ZF-based precoder $b_k(n)$ for user $k$ at
each fading state $n$ maximizes the user's own equivalent channel gain $\|h_k(n) b_k(n)\|^2$ subject to the constraint
that its associated co-channel interference must be completely removed, i.e., $h_{k'}(n) b_k(n) = 0$, $\forall k' \ne k$. Like
TDMA, ZF-based precoding also decomposes the MISO-BC into $K$ (assuming $K \le M$) single-user SISO
channels, and thereby optimal single-user power control schemes can be applied.

In Fig. 5 and Fig. 6, these three schemes are compared for two cases of network throughput, one corresponding
to the sum-rate of users $2R^*_{\mathrm{DC}} + 2R^*_{\mathrm{NDC}} = 6$ bits/complex dimension, and one corresponding to 2 bits/complex
dimension. The required average transmit power over 500 randomly generated channel realizations is plotted
versus the loading factor $\gamma$. First, in Fig. 5 it is observed that in the case of high throughput (bandwidth-limited),
ZF-based SDMA outperforms TDMA because of its larger spectral efficiency from spatial multiplexing.
However, in the case of low throughput (power-limited), in Fig. 6 it is observed that TDMA achieves better power
efficiency than ZF-based SDMA. This is because coherent precoding in TDMA provides more diversity and array
gains, which dominate over spatial multiplexing gains in the power-limited regime. Secondly, it is
observed that in both cases of high and low throughput, the proposed scheme always outperforms both TDMA
and ZF-based SDMA for any loading factor $\gamma$. This is because the proposed scheme optimally balances the
achievable spatial multiplexing, array, and diversity gains for the fading MISO-BC. Thirdly, it is observed that
for all schemes, the required transmit power associated with some $\gamma$, $\gamma < 0.5$, is always larger than that with
$1 - \gamma$ (e.g., comparing $\gamma = 0.1$ and $\gamma = 0.9$), i.e., given the same portion of data traffic in the total traffic,
NDC traffic has a better power efficiency than DC traffic. This is because NDC traffic allows for more flexible
dynamic resource allocation than DC traffic.
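
The power advantage of NDC over DC traffic can be reproduced on a single-user channel with the two power control rules mentioned for the TDMA scheme. The sketch below is illustrative only and rests on assumptions not taken from the simulations above: a single user with coherent precoding over $M$ antennas, so that the effective gain is $\|h\|^2$ (a sum of $M$ unit-mean exponentials for $h \sim \mathcal{CN}(0, I)$), unit noise power, and hypothetical function names.

```python
import math
import random

def sample_gains(num_states, antennas, seed=0):
    """Effective gains ||h||^2 for h ~ CN(0, I): sum of `antennas` unit-mean exponentials."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) for _ in range(antennas)) for _ in range(num_states)]

def inversion_power(gains, rate):
    """Average power of channel inversion: the same `rate` (bits) in every fading state."""
    return sum((2.0 ** rate - 1.0) / g for g in gains) / len(gains)

def waterfilling_power(gains, rate, tol=1e-9):
    """Minimum average power whose average rate over `gains` is at least `rate`.

    Power in state n is max(level - 1/g_n, 0); the water level is found by bisection.
    """
    def avg_rate(level):
        return sum(math.log2(max(level * g, 1.0)) for g in gains) / len(gains)

    lo, hi = 0.0, 1.0
    while avg_rate(hi) < rate:        # grow the bracket until the target is reachable
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if avg_rate(mid) < rate:
            lo = mid
        else:
            hi = mid
    return sum(max(hi - 1.0 / g, 0.0) for g in gains) / len(gains)
```

For gains drawn with `sample_gains(500, 4)`, `waterfilling_power` does not exceed `inversion_power` at the same rate, because a constant-rate allocation is one feasible point of the average-rate problem.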

B. Convergence of Online Algorithm 

The convergence of the online algorithm in Table II is validated by simulations. For simplicity, it is assumed
that only NDC users are present in the network. A MISO-BC with $K = 2$ and $M = 4$ is considered. The target
average rates for user-1 and user-2 are 3 and 1 bits/complex dimension, respectively. The online algorithm is
run for 3000 consecutive transmissions with randomly generated channel realizations. Initially, the dual
variables are set as $\mu_1[0] = \mu_2[0] = 1$, and the estimated average transmit rates as $\bar{R}_1[0] = \bar{R}_2[0] = 0$. The
updates of $\{\bar{R}_k[t]\}$ and $\{\mu_k[t]\}$ as $t$ proceeds are shown in Fig. 7 and Fig. 8, respectively, for $\Delta = \epsilon = 0.01$.
It is observed that the online algorithm converges to the optimal dual variables and target average rates for both
users after a few hundred iterations. Simulation results (not shown in this paper) for other values of $\Delta$
and $\epsilon$ indicate that in general a larger step size leads to faster convergence but also results in more
frequent oscillations.

C. Throughput-Delay Tradeoff 

Fig. 9 compares the network throughput under the two extreme cases of delay constraint considered in this paper,
namely, the expected throughput and the delay-limited throughput. It is assumed that $M = 2$, and two types of
networks are considered: one is a 2-user network with user rate profile $\alpha = [\frac{2}{3} \ \frac{1}{3}]$, and the other is a 4-user
network with $\alpha = [\frac{1}{3} \ \frac{1}{3} \ \frac{1}{6} \ \frac{1}{6}]$. Notice that the ratio of rate demand between any of the first two users and any of
the last two users in the second case is the same as that between the two users in the first case, which is equal to
2. First, it is observed that for both networks, the delay penalties are only moderate for all considered transmit
power values. The delay penalty increases as $K$ becomes larger than $M$, but only slightly. The small delay penalties
in both cases can be explained by extending the multi-antenna channel hardening effect [28] in the single-user
case to the fading MISO-BC, i.e., as the number of degrees of fading increases with $M$ for some constant $K$,
not only the mutual information associated with each user's channel but also the whole capacity region of the
BC varies less over different fading states. Therefore, the sum-rate of users for any given rate profile
also changes less dramatically, and as a consequence, imposing a set of strict constant-rate constraints at each
fading state (equivalent to a fixed rate profile) does not incur a large throughput loss for the fading MISO-BC.

Secondly, it is observed that the expected throughput for the 4-user network exceeds that for the 2-user
network given that both networks have a similar allocation of rate portions among users. This throughput gain can
be explained by the well-known multiuser diversity effect [13] for NDC transmission. As the number of degrees
of fading increases with $K$ for some constant $M$, the BS can select a relatively fixed number of users (around $M$)
with the best joint channel realizations from a larger pool of users for transmission at each time. Thereby,
the network throughput is boosted provided that each user has sufficient delay tolerance. On the other hand, it
is observed that, perhaps more surprisingly, the delay-limited throughput for the 4-user network also exceeds
that for the 2-user network under a similar allocation of rate portions among users. Notice that in this case, all
users' rates need to be constant at each fading state, and consequently, the adaptive rate allocation used for achieving the
expected throughput does not apply here. This throughput gain in the DC case can also be explained by a more
general form of the multiuser diversity effect. Even though a set of constant rates needs to be satisfied at
each fading state in the DC case, a larger number of users gives the BS more flexibility in jointly optimizing the
allocation of transmit resources among users based on their channel realizations. This multiuser diversity gain
becomes more substantial as $K$ increases because each new user brings along an additional $M$ degrees of fading.
However, it is also important to note that the multiuser diversity gain in the delay-limited case has
certain limitations, e.g., as $K$ becomes overwhelmingly larger than $M$, the delay-limited throughput eventually
saturates (see Theorem 4.1).

D. Throughput-Fairness Tradeoff 

Finally, the tradeoff between the achievable network throughput and the fairness of user rate allocation is
demonstrated. NDC transmission is considered and hence the expected throughput is of interest. A network
with 2 users is considered, and it is assumed that $M = 2$ and the LTPC at the BS is fixed as 10. Two
fading channel models are considered: one has symmetric fading, where both user channel vectors $h_k(n)$, $k = 1, 2$,
are assumed to be distributed as $\mathcal{CN}(0, I)$; and the other has asymmetric fading, where $h_1(n) \sim \mathcal{CN}(0, 2I)$
and $h_2(n) \sim \mathcal{CN}(0, \frac{1}{2} I)$. Notice that the asymmetric-fading case may correspond to a near-far situation in the
cellular network, where user-1 is closer to the BS and hence has an average channel gain of approximately
$20 \times \log_{10} 4 = 12$ dB compared to user-2. Let $\phi$ denote the portion of the total average-rate demand assigned to
user-1, i.e., for the corresponding rate profile $\alpha$, $\alpha_1 = \phi$. In Fig. 10, the expected throughput is shown as a
function of $\phi$ for both symmetric- and asymmetric-fading cases. It is observed that in the symmetric-fading
case, a strict fairness constraint for equal rate allocation among users, i.e., $\phi = 0.5$, also corresponds to the
maximum expected throughput or the sum capacity. In contrast, for the case of asymmetric fading, the maximum
expected throughput is achieved when $\phi = 0.7$, i.e., user-1 is allocated 70% of the expected throughput because
of its superior channel condition. However, in the latter case, a strict fairness constraint with $\phi = 0.5$ yields a
throughput loss of only 0.3 bits/complex dimension. This small fairness penalty can be explained by observing
that the expected throughput for the fading MISO-BC under both symmetric and asymmetric fading is quite
insensitive to $\phi$ over a very large range of its values, indicating that by optimizing resource allocation, transmission
with very heterogeneous rate requirements may incur only a negligible network throughput loss.

VI. Concluding Remarks

This paper investigates the capacity limits and the associated optimal dynamic resource allocation schemes
for the fading MISO-BC under various transmission delay and fairness considerations. First, this paper studies
transmit optimization with mixed delay-constrained and delay-tolerant data traffic. The proposed online resource
allocation algorithm is based on a two-layer Lagrange-duality method, and is amenable to real-time cross-layer
implementation. Secondly, this paper characterizes some key fundamental tradeoffs between network throughput,
transmission delay, and user fairness in rate allocation, and draws some novel insights pertinent to multiuser
diversity and channel hardening effects for the fading MISO-BC. This paper shows that when there are similar
numbers of users and transmit antennas at the BS, the delay penalty and the fairness penalty in the achievable
network throughput may be only moderate, suggesting that employing multiple antennas at the BS is an effective
means for delivering data traffic with heterogeneous delay and rate requirements.

The results obtained in this paper can also be extended to uplink transmission in a cellular network by
considering, as a consequence of the BC-MAC duality, the fading MAC under individual user power constraints
instead of the sum-power constraint in this paper. Hence, another important factor that needs to be taken into account
by dynamic resource allocation for the fading MAC is the fairness in user transmit power consumption. The
concept of the rate profile in this paper can also be applied to define a similar power profile for regulating the
power consumption between users for the MAC [11]. Furthermore, although the results developed in this paper
are under the assumption of capacity-achieving transmission using DPC-based non-linear precoding at the BS,
they are readily extendible to other suboptimal transmission methods such as linear precoding, provided that
the rate region achievable by these methods is still a convex set, so that similar convex
optimization techniques can be applied.

Appendix I 
Alternative Algorithm for Problem P3 

In Problem P3, the constraints are separable and the objective function is convex. Hence, this problem can
be solved iteratively by the block-coordinate descent method [21]. At each iteration, this method minimizes the
objective function w.r.t. one $q_k$ while holding all the other $q_k$'s constant. More specifically, the method minimizes
(23) w.r.t. $q_1$ with constant $\{q_2, \ldots, q_K\}$, then $q_2$ with constant $\{q_1, q_3, \ldots, q_K\}$, and so on up to $q_K$ with constant
$\{q_1, \ldots, q_{K-1}\}$, after which the above routine is repeated. Because the objective function
only decreases after each iteration, convergence to the global minimum of the objective function is ensured. Each iteration of the
above algorithm is described as follows. Without loss of generality, assume that in (23), $\pi(k) = k$, $k = 1, \ldots, K$.
Consider an arbitrary iteration for minimizing (23) w.r.t. $q_m$, $m \in \{1, \ldots, K\}$, with all the other $q_k$'s constant.
Problem P3 can then be rewritten as



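$$
\begin{aligned}
\text{Minimize} \quad & q_m - \sum_{k=m}^{K} (\beta_k - \beta_{k+1}) \log_2 \left| h_m^\dagger h_m q_m + \sum_{i=1, i \ne m}^{k} h_i^\dagger h_i q_i + I \right| && (41) \\
\text{Subject to} \quad & q_m \ge 0. && (42)
\end{aligned}
$$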

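By introducing the dual variable $\lambda_m$, $\lambda_m \ge 0$, associated with the constraint $q_m \ge 0$, the Karush-Kuhn-Tucker
(KKT) optimality conditions [20] state that the optimal $q_m^*$ and dual variable $\lambda_m^*$ for this problem must satisfy

$$\sum_{k=m}^{K} (\beta_k - \beta_{k+1}) h_m \left( h_m^\dagger h_m q_m^* + \sum_{i=1, i \ne m}^{k} h_i^\dagger h_i q_i + I \right)^{-1} h_m^\dagger = (1 - \lambda_m^*) \log 2, \tag{43}$$

$$\lambda_m^* q_m^* = 0. \tag{44}$$

Let $d(q_m)$ denote the function on the left-hand side of (43), evaluated at $q_m$. From (44), it is inferred that $q_m^* > 0$ only if $\lambda_m^* = 0$.
From (43), and noting that $d(q_m)$ is a non-increasing function of $q_m$ for $q_m \ge 0$, it follows that $q_m^* > 0$
occurs only if $d(0) > \log 2$. Thus, it follows that

$$q_m^* = \begin{cases} 0 & \text{if } d(0) \le \log 2 \\ q_0 & \text{otherwise,} \end{cases} \tag{45}$$

where $q_0$ is the unique root of $d(q_m) = \log 2$. $q_0$ can be easily found by a bisection search [20] over $[0, \frac{\beta_m}{\log 2}]$,
where the above upper bound for $q_0$ is obtained by the following inequalities and equality (with $\beta_{K+1} = 0$):

$$
\begin{aligned}
\log 2 & \le \sum_{k=m}^{K} (\beta_k - \beta_{k+1}) h_m \left( h_m^\dagger h_m q_m^* + I \right)^{-1} h_m^\dagger && (46) \\
& = \frac{\beta_m \|h_m\|^2}{1 + \|h_m\|^2 q_m^*} && (47) \\
& \le \frac{\beta_m}{q_m^*}. && (48)
\end{aligned}
$$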
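
One coordinate update of the block-coordinate descent method can be sketched as follows. To keep the example free of matrix algebra, it is restricted to the scalar case $M = 1$, which is an assumption of this sketch and not the setting of the paper: each $h_i^\dagger h_i$ reduces to a gain $|h_i|^2$ and the matrix inverse in (43) becomes a reciprocal. The function name `update_q`, the zero-based indexing, and the convention that `beta` carries a trailing zero are likewise illustrative.

```python
import math

def update_q(m, q, gains, beta):
    """Minimize (41) over q[m] >= 0 in the scalar case, with the other entries of q fixed.

    gains[i] = |h_i|^2; beta has K + 1 non-increasing entries with beta[K] = 0.
    """
    K = len(q)
    ln2 = math.log(2.0)

    def d(x):
        # Left-hand side of (43) for single-antenna channels.
        base = 1.0 + sum(gains[i] * q[i] for i in range(m))
        total = 0.0
        for k in range(m, K):
            if k > m:
                base += gains[k] * q[k]
            total += (beta[k] - beta[k + 1]) * gains[m] / (gains[m] * x + base)
        return total

    if d(0.0) <= ln2:
        return 0.0                    # the constraint q_m >= 0 is active, as in (45)
    lo, hi = 0.0, beta[m] / ln2       # d(hi) < ln 2 because d(x) < beta[m] / x
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if d(mid) > ln2:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Sweeping `m` over all users and repeating until the values stop changing gives the full descent loop. For a single user with gain 4 and `beta = [1.0, 0.0]`, the update solves $\frac{4}{4q + 1} = \log 2$, i.e., $q = \frac{1}{\log 2} - \frac{1}{4}$.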

Appendix II
Proof of Theorem 4.1

First, we obtain an upper bound for the delay-limited throughput $C_d$ by assuming that there is no co-channel
interference between users, as opposed to the successive interference pre-subtraction by DPC. Under this assumption,
the fading MISO-BC is decomposed into $K$ parallel single-user fading channels. Let $p_k$ denote the average
transmit power assigned to user $k$, $k = 1, \ldots, K$. The maximum constant rate achievable over all fading states
(the so-called delay-limited capacity [4]) for user $k$ can be expressed as [27]

$$C_d(k) = \log_2 \left( 1 + \frac{p_k}{\rho_k} \right), \tag{49}$$

where $\rho_k = E_n\left[ \frac{1}{\|h_k(n)\|^2} \right]$. Because of the assumed symmetric fading (hence, $\rho_k = \rho$, $\forall k$) and symmetric rate
demand (hence, $C_d(k) = \frac{C_d}{K}$, $\forall k$), it follows that $p_k = \frac{p^*}{K}$, $\forall k$, achieves the maximum average sum-rate of users.
Hence, the delay-limited throughput is upper-bounded by

$$C_d \le K \log_2 \left( 1 + \frac{p^*}{K \rho} \right), \tag{50}$$

which holds for any $K \ge 1$. By taking the limit of the right-hand side of (50) as $K \to \infty$, the proof of Theorem
4.1 is completed.
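
The limit taken in the last step can be checked numerically. The short sketch below evaluates the right-hand side of (50) for increasing $K$ and compares it with $\frac{p^*}{\rho \log 2}$; the function names and the numerical values of $p^*$ and $\rho$ used with them are illustrative assumptions.

```python
import math

def delay_limited_bound(K, p_star, rho):
    """Right-hand side of (50): K * log2(1 + p_star / (K * rho))."""
    return K * math.log2(1.0 + p_star / (K * rho))

def limiting_bound(p_star, rho):
    """Limit of the bound as K grows without bound: p_star / (rho * ln 2)."""
    return p_star / (rho * math.log(2.0))
```

The bound increases with $K$ and approaches the limit from below, which is the saturation behavior stated in Theorem 4.1. As a side remark under the $\mathcal{CN}(0, I)$ channel model with $M \ge 2$ antennas, $\|h_k(n)\|^2$ is Gamma-distributed with shape $M$ and unit scale, so $\rho = \frac{1}{M-1}$ could be used as the input value.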

Fig. 1. MISO-BC for SDMA-based downlink transmission in a single cell of a wireless cellular network.

Fig. 2. Transmission scheduling at the MAC layer for packet data and circuit data.

Fig. 3. Illustration of the delay penalty $C_{\mathrm{DP}}(p^*, \alpha)$.

Fig. 4. Illustration of the fairness penalty $C_{\mathrm{FP}}(p^*, \alpha_e)$ in the expected capacity region $\mathcal{C}_e(p^*)$.

Fig. 5. Comparison of the average transmit powers for different schemes under mixed NDC and DC transmission. The total amount of
NDC and DC traffic is 6 bits/complex dimension.

Fig. 6. Comparison of the average transmit powers for different schemes under mixed NDC and DC transmission. The total amount of
NDC and DC traffic is 2 bits/complex dimension.

Fig. 7. The estimated average transmit rate $\bar{R}_k[t]$ at different time $t$ obtained by the online algorithm. The target average rates for user-1
and user-2 are 3 and 1 bits/complex dimension, respectively.

Fig. 8. The dual variable $\mu_k[t]$ at different time $t$ obtained by the online algorithm.

Fig. 9. Comparison of the expected throughput and the delay-limited throughput.

Fig. 10. Comparison of the expected throughput under different fairness constraints.